1 Numerical Methods in Stochastic Control


The  second  edition  of  our  book  [4]  on  numerical  methods  in  stochastic  control  has  appeared.  The 
book  and  the  methods  contained  therein  are  now  the  standard  in  the  field.  It  contains  the  most 
comprehensive  development  of  numerical  algorithms  and  associated  convergence  proofs  for  a  large 
part of the current forms of stochastic control problems in continuous time. The PI's algorithms
(and  proof  techniques)  are  the  algorithms  of  choice  for  the  bulk  of  continuous  time  stochastic  control 
problems.  In  addition  to  the  broad  coverage  of  the  first  edition,  it  gives  numerical  algorithms  and 
proofs  for  problems  where  the  variance  term  is  controlled,  and  for  jump-diffusions  where  the  jump 
is  controlled.  Important  applications  of  jump  control  occur,  for  example,  in  communications  theory. 
Consider,  for  example,  a  system  where  a  server  divides  its  time  between  several  queues  whose  input 
processes  are  bursty,  and  the  individual  connections  are  subject  to  random  breakdown  or  fading. 
The  control  problem  is  the  scheduling  of  the  server  and  this  must  be  done  continuously.  A  jump 
increase  in  the  total  system  workload  can  occur  when  some  connection  breaks  down  or  fades  and 
the  work  in  the  available  queues  is  less  than  the  server  can  handle,  but  customers  continue  to 
arrive  at  the  unavailable  queues,  so  there  is  undesired  idle  time.  The  control  policy  affects  the 
jump  sizes.  Traditional  methods  cannot  handle  such  problems.  The  standard  use  of  the  Poisson 
measure  driven  model  is  no  longer  adequate,  and  a  general  theory  is  developed.  Additionally,  the 
book  contains  a  thorough  development  of  deterministic  problems  that  arise  in  control  and  in  the 
calculus  of  variations,  and  includes  discontinuous  or  unbounded  dynamical  terms,  with  applications 
to  image  reconstruction,  large  deviations,  and  elsewhere.  The  algorithms  are  about  the  fastest  and 
most  stable  available,  and  there  are  convergence  proofs  for  all  of  them. 

Numerical methods in stochastic control are now a fundamental tool for the solution and investigation
of stochastic problems, whether controlled or not. In any particular application, one is not
usually  interested  in  a  single  cost  criterion,  whether  it  is  the  mean  number  in  the  system,  the  mean 
waiting  time  or  anything  else.  There  are  usually  several  conflicting  criteria,  where  improving  one 
might  mean  hurting  another.  If  a  small  improvement  in  one  comes  at  the  expense  of  a  large  loss  in 
another,  then  it  will  be  unacceptable.  But  whether  or  not  any  particular  weighting  of  the  criteria 
will  yield  an  optimal  policy  that  is  an  acceptable  solution  (in  practice)  is  not  usually  known  before 
a problem is solved. A very useful role of optimization is to explore the possible tradeoffs, that is, what
happens under the optimal controls for various given sets of weights. One would numerically solve
a sequence of limit problems to get controls that are optimal under different weightings of the basic
criteria  of  interest,  to  get  the  general  structure  and  the  parametric  dependencies  of  the  controls 
and  costs.  The  resulting  quantitative  and  qualitative  information  and  systematic  exploration  of  the 
possible  tradeoffs  among  the  various  cost  components  can  be  extremely  useful  in  design,  as  seen  in 
the  study  of  a  communications  system  in  [5,  6].  This  approach  represents  a  significant  application 
of  our  numerical  methods  and  places  great  demands  on  them.  The  method  is  robust  and  simplifies 
the  analysis  (both  analytical  and  numerical). 
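
The weight-sweep idea can be sketched on a toy problem. The example below is only an illustration under assumptions that do not come from the studies [5, 6]: an M/M/1/K queue with load rho, where the only "control" is the buffer size K, and where the two conflicting criteria are the mean number in the system and the blocking probability. The names mm1k_criteria, sweep_weights and block_cost are hypothetical. For each weight w the code minimizes w * (mean number) + (1 - w) * block_cost * (blocking probability) and reports both criteria at the minimizer, which traces out the tradeoff curve.

```python
def mm1k_criteria(rho, K):
    """Mean number in system and blocking probability of an M/M/1/K queue."""
    weights = [rho ** n for n in range(K + 1)]   # unnormalized stationary law
    total = sum(weights)
    pi = [w / total for w in weights]
    mean_number = sum(n * p for n, p in enumerate(pi))
    blocking = pi[K]                             # arrivals are lost when full
    return mean_number, blocking


def sweep_weights(rho, K_values, w_values, block_cost=100.0):
    """For each weight w, pick the buffer size minimizing the weighted cost."""
    criteria = [mm1k_criteria(rho, K) for K in K_values]
    rows = []
    for w in w_values:
        costs = [w * L + (1.0 - w) * block_cost * B for (L, B) in criteria]
        best = min(range(len(costs)), key=costs.__getitem__)
        rows.append((w, K_values[best], criteria[best][0], criteria[best][1]))
    return rows


if __name__ == "__main__":
    w_grid = [0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.95]
    for w, K, L, B in sweep_weights(0.9, list(range(1, 41)), w_grid):
        print(f"w={w:.2f}  K={K:2d}  mean number={L:6.3f}  blocking={B:.4f}")
```

Reading the rows side by side shows how much blocking must be accepted for each reduction in the mean number in the system, which is the kind of information that a single optimization with one fixed weighting does not provide.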

Numerical  Approximations  for  Stochastic  Differential  Games:  The  Ergodic  Case.  The 
Markov  chain  approximation  method  is  a  widely  used,  relatively  easy  to  use,  and  efficient  family  of 
methods for the bulk of stochastic control problems in continuous time, for reflected-jump-diffusion
type  models.  It  has  been  shown  to  converge  under  broad  conditions,  and  there  are  good  algorithms 
for  solving  the  numerical  problems,  if  the  dimension  is  not  too  high.  In  [3]  we  consider  a  class  of 
stochastic  differential  games  with  a  reflected  diffusion  system  model  and  ergodic  cost  criterion  and 
where  the  controls  for  the  two  players  are  separated  in  the  dynamics  and  cost  function.  It  is  shown 
that  the  value  of  the  game  exists  and  that  the  numerical  method  converges  to  this  value  as  the 
discretization  parameter  goes  to  zero.  The  actual  numerical  method  solves  a  stochastic  game  for  a 
finite  state  Markov  chain  and  ergodic  cost  criterion.  The  essential  conditions  are  nondegeneracy  and 
that  a  weak  local  consistency  condition  hold  “almost  everywhere”  for  the  numerical  approximations, 
just  as  for  the  control  problem. 
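
For readers unfamiliar with the local consistency requirement, the following sketch may help. It is not the game algorithm of [3]; it assumes the simplest setting, a one-dimensional diffusion dx = b dt + sigma dW with constant coefficients, approximated on a grid of spacing h by the usual upwind transition probabilities. The function names are hypothetical. Local consistency asks that the conditional mean and variance of one step of the chain agree with b*dt and sigma^2*dt up to terms that are small relative to the interpolation interval dt, and the code checks this numerically.

```python
def chain_transition(b, sigma, h):
    """One-step law of an upwind approximating chain for dx = b dt + sigma dW.

    Returns (p_up, p_down, dt): the probabilities of moving +h and -h, and the
    interpolation interval. Requires sigma**2 + h*abs(b) > 0.
    """
    q = sigma ** 2 + h * abs(b)
    p_up = (0.5 * sigma ** 2 + h * max(b, 0.0)) / q
    p_down = (0.5 * sigma ** 2 + h * max(-b, 0.0)) / q
    dt = h ** 2 / q
    return p_up, p_down, dt


def local_rates(b, sigma, h):
    """Mean and variance of one chain step, per unit of interpolated time."""
    p_up, p_down, dt = chain_transition(b, sigma, h)
    mean = h * (p_up - p_down)
    var = h ** 2 * (p_up + p_down) - mean ** 2
    return mean / dt, var / dt


if __name__ == "__main__":
    for h in (0.1, 0.01, 0.001):
        drift, var = local_rates(b=-1.5, sigma=0.8, h=h)
        print(f"h={h:<6} drift rate={drift:+.6f}  variance rate={var:.6f}")
```

With this choice the drift rate equals b exactly, and the variance rate differs from sigma^2 by a term of order h, so the mismatch vanishes as the discretization parameter goes to zero.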

Heavy  Traffic  Analysis  of  Controlled  Queueing  and  Communication  Networks.  Another 
major achievement was the appearance of this new book [2]. It is, by far, the most comprehensive
book on the subject. It provides a thorough development of the powerful methods of heavy traffic analysis
and approximations with applications to a wide variety of stochastic (e.g., queueing and communication)
networks, for both controlled and uncontrolled systems. The approximating models are
reflected  stochastic  differential  equations.  The  analytical  and  numerical  methods  yield  considerable 
simplifications  and  insights  and  good  approximations  to  both  path  properties  and  optimal  controls 
under  broad  conditions  on  the  data  and  structure.  The  general  theory  is  developed,  with  possibly 
state  dependent  parameters,  and  specialized  to  many  different  cases  of  practical  interest.  Control 
problems  in  telecommunications  and  applications  to  scheduling,  admissions  control,  polling,  and 
elsewhere  are  treated.  There  is  a  detailed  survey  of  reflected  stochastic  differential  equations,  weak 
convergence  theory,  methods  for  characterizing  limit  processes,  and  ergodic  problems. 

Stability and Control of Mobile Communications Systems With Time Varying Channels.
Consider the forward link of a mobile communications system with a single transmitter and
rather  arbitrary  randomly  time  varying  channels  connecting  the  base  to  the  mobiles.  Data  arrives 
at  the  base  in  some  random  way  (and  might  have  a  bursty  character)  and  is  queued  according  to  the 
destination  until  transmitted.  The  main  issues  are  the  allocation  of  transmitter  power  and  time  to 
the various queues in a queue- and channel-state dependent way to assure stability and good operation.
The control decisions are made at the beginning of the (small) scheduling intervals. Stability
methods are used in [1] to allocate time and power. Many schemes of current interest can be handled,
for example, CDMA with control over the bit interval and power per bit, TDMA with control over
the  time  allocated,  power  per  bit,  and  bit  interval,  as  well  as  arbitrary  combinations.  There  might 
be  random  errors  in  transmission  which  require  retransmission.  The  channel-state  process  might  be 
known  or  only  partially  known.  The  details  of  the  scheme  are  not  directly  involved;  all  essential 
factors  are  incorporated  into  a  “rate”  and  “error”  function.  The  system  and  channel  process  are 
scaled  by  speed.  Under  a  stability  assumption  on  a  model  obtained  from  the  “mean  drift,”  and  some 
other  natural  conditions,  it  is  shown  that  the  scaled  physical  system  can  be  controlled  to  be  stable, 
uniformly  in  the  speed,  for  fast  enough  speeds.  Owing  to  the  non-Markov  nature  of  the  problem,  we 
use  the  perturbed  Liapunov  function  method,  which  is  very  useful  for  the  analysis  of  non-Markovian 
systems.  Finally,  the  stability  method  is  used  to  actually  choose  the  power  and  time  allocations.  The 
allocation  will  depend  on  the  Liapunov  function.  But  each  such  function  corresponds  loosely  to  an 
optimization  problem  for  some  performance  criterion.  Since  there  is  a  choice  of  Liapunov  functions, 
various  performance  criteria  can  be  taken  into  account  in  the  allocations.  The  resulting  controls  are 
quite  reasonable.  The  power  of  the  method  is  due  to  the  rather  general  conditions  under  which  it 
works  and  the  reasonableness  of  the  controls. 
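
How a Liapunov function can generate an allocation rule is easiest to see in a stripped-down setting. The sketch below is an illustration under strong assumptions and is not the scheme of [1]: time is slotted, exactly one queue is served per slot, rates[i] stands in for the "rate" function evaluated at the current channel state, and the Liapunov function is taken to be the quadratic V(x) = (1/2) * sum of x_i^2. To first order, the mean one-slot change in V is the sum of x_i * (arrivals_i - service_i), so minimizing the drift amounts to serving the queue with the largest product x_i * rates[i]. All function names are hypothetical.

```python
def choose_queue(queues, rates):
    """Index of the queue maximizing (queue length) * (current channel rate)."""
    return max(range(len(queues)), key=lambda i: queues[i] * rates[i])


def slot_update(queues, arrivals, rates):
    """Serve the chosen queue for one slot, then add the new arrivals."""
    i = choose_queue(queues, rates)
    served = min(queues[i], rates[i])      # cannot send more than is queued
    new_queues = list(queues)
    new_queues[i] -= served
    return [q + a for q, a in zip(new_queues, arrivals)]


if __name__ == "__main__":
    queues = [3, 1]
    # Queue 1 is short but currently has a much better channel.
    print(choose_queue(queues, rates=[1, 5]))
    print(slot_update(queues, arrivals=[1, 0], rates=[1, 5]))
```

Replacing the quadratic by a differently weighted Liapunov function changes the weights in the product, which is one way to read the remark that different functions correspond loosely to different performance criteria.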

Large Deviation Approximations for Occupancy Problems. The paper [1] (with C. Nuzman
and P. Whiting of Bell Labs) was completed. In the occupancy problem one considers the distribution
of r balls in n cells, with each ball assigned independently to a given cell with probability 1/n.
In  the  paper  just  mentioned  a  large  deviation  approximation  as  r  and  n  tend  to  infinity  was  proved. 
Occupancy problems have many applications in computer science, mathematical biology and elsewhere,
but the original motivation for this work is the design of an optical communication switch,
where blocking probabilities are of central importance. Here the cells correspond to channels and
the  balls  to  packets  being  routed  through  the  switch.  In  order  to  analyze  the  problem  a  dynamical 
model  is  first  considered,  where  the  balls  are  placed  in  the  cells  sequentially  and  “time”  corresponds 
to  the  number  of  balls  that  have  already  been  thrown.  A  complete  large  deviation  analysis  of  this 
“process  level”  problem  is  carried  out,  and  the  rate  function  for  the  original  problem  is  then  obtained 
via the contraction principle. The variational problem that characterizes this rate function is analyzed,
and (in sharp contrast to most analyses of large deviations for Markov processes) a complete
and  explicit  solution  is  obtained.  The  minimizing  trajectories  and  minimal  cost  are  identified  up 
to  two  constants,  and  the  constants  are  characterized  as  the  unique  solution  to  an  elementary  fixed 
point  problem.  These  results  are  then  used  to  solve  a  number  of  interesting  problems,  including  the 
overflow  problem  and  the  so-called  partial  coupon  collector's  problem. 
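
The occupancy model itself is simple to simulate, and a small experiment shows why analytical approximations are needed for the rare events. The sketch below is an illustration only and does not implement the large deviation approximation of the paper. It throws r balls into n cells, counts the empty cells, and compares the sample mean with the exact mean n(1 - 1/n)^r. When r = beta*n, the fraction of empty cells concentrates near exp(-beta) as n grows, and the large deviation result describes the probabilities of departures from that typical value. The function names and parameter values are assumptions made for the example.

```python
import random


def expected_empty(n, r):
    """Exact mean number of empty cells after r balls are thrown into n cells."""
    return n * (1.0 - 1.0 / n) ** r


def simulate_empty(n, r, trials, seed=0):
    """Monte Carlo samples of the number of empty cells."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        occupied = {rng.randrange(n) for _ in range(r)}
        samples.append(n - len(occupied))
    return samples


def tail_estimate(samples, threshold):
    """Empirical probability that at least `threshold` cells are empty."""
    return sum(s >= threshold for s in samples) / len(samples)


if __name__ == "__main__":
    n, r = 50, 50
    samples = simulate_empty(n, r, trials=20000)
    print("sample mean:", sum(samples) / len(samples))
    print("exact mean: ", expected_empty(n, r))
    for k in (20, 24, 28, 32):
        print(f"P(empty >= {k}) ~ {tail_estimate(samples, k):.5f}")
```

The empirical tail estimates drop to zero quickly as the threshold moves away from the mean, so plain simulation says little about the small blocking or overflow probabilities of interest.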

Regulation and Analysis of Stochastic Networks. Research in this area proceeded along several
directions. Key issues in our investigations included: (i) the development of approximate models
that  allow  for  explicit  (or  nearly  explicit)  construction  of  the  optimal  routing/service  policies,  and 
(ii) robustness and the ability to deal with model perturbations. The problems considered encompass
a number of different formulations, each of which emphasizes different qualitative properties of
the  resulting  controlled  network.  These  include  risk-sensitive  control  and  the  control  of  rare  events, 
optimal  control  of  “fluid”  models,  and  optimally  robust  control  of  such  models.  “Fluid”  models  are 
approximate  models  that  are  obtained  under  a  law  of  large  numbers  scaling.  In  all  cases  the  system 
model  is  constrained,  and  in  most  problems  this  is  a  dominant  feature.  To  handle  the  constraints,  we 
model  the  state  dynamics  of  the  approximate  (or  limit)  models  in  terms  of  an  appropriate  Skorokhod 
Problem  and  corresponding  constrained  ordinary  or  stochastic  differential  equations.  The  different 
problem  setups  (e.g.,  control  of  rare  events  versus  control  of  fluid  models)  lead  to  problems  of  the 
same basic form, namely a Skorokhod Problem with (relatively) simple dynamics and simple cost
structures. 
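
To make the notion of a Skorokhod Problem concrete, the sketch below treats the simplest case, which is much simpler than the multidimensional constrained network models discussed here: a one-dimensional path psi constrained to stay in [0, infinity). In that case the solution is explicit, phi(t) = psi(t) + eta(t) with eta(t) = max(0, max over s <= t of -psi(s)), where eta is the minimal nondecreasing "pushing" term that keeps phi nonnegative. The code applies this formula to a path sampled at discrete times; the function name and the sample path are assumptions made for illustration.

```python
import random
from itertools import accumulate


def skorokhod_map(psi):
    """One-dimensional Skorokhod map on [0, inf) for a sampled path psi.

    Returns (phi, eta) with phi[k] = psi[k] + eta[k] >= 0, where
    eta[k] = max(0, max_{j <= k} -psi[j]) is the running pushing term.
    """
    eta = list(accumulate((max(0.0, -x) for x in psi), max))
    phi = [x + e for x, e in zip(psi, eta)]
    return phi, eta


if __name__ == "__main__":
    rng = random.Random(0)
    # Unconstrained path: a random walk with negative drift.
    psi, x = [], 0.0
    for _ in range(1000):
        psi.append(x)
        x += rng.gauss(-0.05, 1.0)
    phi, eta = skorokhod_map(psi)
    print("min of constrained path:", min(phi))
    print("total pushing at the boundary:", eta[-1])
```

The pushing term increases only at times when the constrained path sits at the boundary, which is the defining complementarity property of the problem.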

Paper  [2]  (joint  with  K.  Ramanan  of  Bell  Labs)  deals  with  a  controlled  constrained  ordinary 
differential  equation,  and  with  a  cost  that  depends  only  on  the  control.  Such  problems  occur  in 
the  (ordinary)  control  of  fluid  models,  such  as  control  of  a  network  with  the  objective  of  reducing 
backlogs  in  minimum  time.  Our  analysis  gives  an  explicit  finite-dimensional  representation  of  the 
value  function,  and  identifies  all  optimal  controls.  Reference  [3]  (also  with  Ramanan)  considers 
the  problem  of  characterizing  the  manner  in  which  rare  events  occur  in  networks  under  heavy 
traffic.  It  shows  how  time  reversal  arguments  can  be  applied  to  rewrite  the  control  problem  defined 
by  a  straightforward  large  deviations  analysis  into  the  form  of  [2],  and  then  shows  how  explicit 
finite-dimensional  solutions  can  be  obtained  for  some  interesting  classes  of  problems  (in  arbitrary 
dimension).  A  three  dimensional  example  is  worked  out  in  detail  for  illustrative  purposes. 

The  paper  [4]  considers  the  robust  optimal  control  of  a  law  of  large  numbers  approximation  of 
a  stochastic  network.  The  robust  control  problem  is  formulated  as  a  dynamic  differential  game, 
with  one  player  choosing  the  policies  that  determine  service  and  routing  assignments,  and  the  other 
choosing  quantities  such  as  the  arrival  and  service  rates,  subject  to  constraints.  The  cost  to  be 
minimized  by  the  first  player  and  maximized  by  the  second  is  the  time  till  the  origin  is  reached.  The 
robust  formulation  allows  one  to  differentiate  between  the  many  policies  (at  the  fluid  level)  that  are 
optimal  for  an  ordinary  cost.  An  explicit  formula  is  given  for  the  value  function,  and  some  of  its 
basic properties are studied. The problem of policy synthesis for particular classes of problems has
also  been  studied,  and  for  these  cases  we  have  explicitly  constructed  a  robustly  optimal  control. 

A  complementary  regulation  problem  is  considered  in  [5],  joint  with  R.  Atar  and  A.  Shwartz  of 
The  Technion.  This  paper  considers  the  problem  of  robust  and  risk-sensitive  control  of  a  stochastic 
network. In controlling such a network, an escape time criterion (rather than time to reach the origin)
is useful if one wishes to regulate the occurrence of large buffers and buffer overflow. A risk-sensitive
escape time criterion is formulated, which in comparison to the ordinary escape time criterion penalizes
exits  which  occur  on  short  time  intervals  more  heavily.  The  properties  of  the  risk-sensitive  problem 
are  studied  in  the  large  buffer  limit,  and  related  to  the  value  of  a  deterministic  differential  game 
with constrained dynamics. We prove that the game has a value, and explicit solutions are obtained
to  illustrate  how  the  results  may  be  applied. 

Approximations and Analysis of Jump Processes for Optimal Stopping. The papers [6] and
[7]  are  both  joint  with  H.  Wang  of  Brown.  The  first  considers  a  basic  issue  in  most  problems  of 
optimal  stopping.  The  optimal  policy  is  usually  obtained  from  a  continuous  time  model,  since  it  is 
only  in  this  case  that  an  explicit  solution  to  the  corresponding  variational  problem  may  be  found. 
In  practice,  however,  one  may  be  restricted  to  stop  only  at  a  fixed  set  of  discrete  times.  In  this 
paper  we  identified  the  rates  of  convergence  of  both  the  optimal  costs  and  the  stopping  regions,  and 
provided  simple  formulas  for  the  rate  coefficients.  The  second  paper  considered  optimal  stopping 
problems where the ability to stop depends on an exogenous Poisson signal process: one can only stop
at  the  Poisson  jump  times.  Even  though  the  time  variable  in  these  problems  has  a  discrete  aspect, 
a  variational  inequality  can  be  obtained  by  considering  an  underlying  continuous  time  structure. 
We  derived  the  asymptotic  behavior  of  the  value  functions  and  optimal  exercise  boundaries  as  the 
intensity of the Poisson process tends to infinity, or, roughly speaking, as the problems converge to
the  classical  continuous-time  optimal  stopping  problems. 
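
The effect of restricting stopping to a discrete set of times can be seen in a small numerical experiment. The sketch below is an assumption-laden illustration and not the analysis of [6]: it takes a discounted put payoff on a binomial tree (the payoff, the tree, and all parameter values are choices made for the example) and allows exercise only at every `every`-th step. Finer exercise grids can only increase the value, and the gap to the finest grid shrinks as the grid is refined; the code makes no claim about the rate.

```python
import math


def restricted_put_value(s0, strike, r, sigma, T, n_steps, every):
    """Binomial value of a put that may be exercised only every `every` steps.

    Exercise is allowed at steps 0, every, 2*every, ... and at maturity.
    """
    dt = T / n_steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)       # risk-neutral up probability
    disc = math.exp(-r * dt)

    def price(step, j):                        # j = number of up moves so far
        return s0 * u ** j * d ** (step - j)

    values = [max(strike - price(n_steps, j), 0.0) for j in range(n_steps + 1)]
    for step in range(n_steps - 1, -1, -1):
        # Discounted expected value of continuing from each node at this step.
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(step + 1)]
        if step % every == 0:                  # stopping is permitted here
            values = [max(v, strike - price(step, j))
                      for j, v in enumerate(values)]
    return values[0]


if __name__ == "__main__":
    args = dict(s0=100.0, strike=100.0, r=0.05, sigma=0.2, T=1.0, n_steps=256)
    finest = restricted_put_value(every=1, **args)
    for every in (64, 16, 4, 1):
        v = restricted_put_value(every=every, **args)
        print(f"exercise every {every:2d} steps: value={v:.5f}  gap={finest - v:.5f}")
```

Here the case every=1 plays the role of the continuous-time problem only up to the discretization error of the tree itself.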